Add extra sanity checks in bmp and ico decoders #632
Conversation
…inor optimization. This commit adds some size limits to the decoder: it limits the palette buffer size and caps the starting pixel buffer size to avoid OOM issues. Additionally, it adds a small optimisation to read_full_byte_pixel_data by reducing the number of read calls, which improves the performance of decoding images with plain 8-bit r/g/b/a values. This commit also fixes the compilation of the benchmarks, which was broken.
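A minimal sketch of that read-call reduction, assuming plain row-major 8-bit samples; the function and names are illustrative, not the PR's actual code:

```rust
use std::io::{Cursor, Read};

// Hypothetical sketch: read a whole row of plain 8-bit samples with a single
// call instead of one small read per pixel.
fn read_row<R: Read>(reader: &mut R, row: &mut [u8]) -> std::io::Result<()> {
    // One read per row instead of width * bytes_per_pixel tiny reads.
    reader.read_exact(row)
}

fn main() -> std::io::Result<()> {
    let data = vec![0u8; 12]; // e.g. one 4-pixel RGB row
    let mut row = vec![0u8; 12];
    read_row(&mut Cursor::new(data), &mut row)?;
    Ok(())
}
```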
Seems I broke the truncate tests for BMP and ICO; I didn't notice as they are ignored by default, so I need to fix that before any merging.

EDIT: Should be fixed in 487be3e
Adding more sanity checks is always good. However, I'm not sure about the changes to allocate a smaller initial buffer. After these changes, is it still possible to craft a malformed BMP that will cause an out-of-memory error or use large amounts of memory? For example, is it possible to craft an RLE BMP with large dimensions but a small file size using an end-of-file marker? If so, this does not fix the denial-of-service attack.
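For illustration, a hypothetical sketch of why this matters: the allocation is driven by header fields the attacker controls, not by the file's actual size.

```rust
// Hypothetical illustration of the concern raised above: a BMP header is only
// a few dozen bytes, but a naive decoder sizes its pixel buffer from the
// attacker-controlled dimension fields before reading any pixel data.
fn claimed_buffer_size(width: u32, height: u32, bytes_per_pixel: u32) -> Option<usize> {
    (width as usize)
        .checked_mul(height as usize)?
        .checked_mul(bytes_per_pixel as usize)
}

fn main() {
    // An RLE stream can encode "end of file" in two bytes, so the file stays
    // tiny while the claimed allocation is enormous (~14.4 GB here).
    let bytes = claimed_buffer_size(60_000, 60_000, 4).unwrap();
    println!("a tiny RLE payload could request {} bytes", bytes);
}
```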
src/bmp/decoder.rs
Outdated
```diff
-try!(self.r.seek(SeekFrom::Start(self.data_offset)));
-for row in self.rows(&mut pixel_data) {
+try!(reader.seek(SeekFrom::Start(self.data_offset)));
+//for row in self.rows(&mut pixel_data) {
```
Delete this.
Will fix this in the next commit.
The limited starting buffer makes this less severe, as images will have to actually be more than 100 MB to allocate that much memory, and the decoder will now reject BMPs with x or y values larger than 2^15 and limits the temporary palette buffer to 256 entries (before, it wasn't limited at all). It's by no means a full solution to the issue, though; this PR is merely a slight improvement on what's currently there.
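A rough sketch of those checks, with hypothetical names and the constants as described above; this is not the exact code in the PR:

```rust
// Illustrative constants matching the limits described in the comment above.
const MAX_DIMENSION: i32 = 1 << 15;                  // reject x/y beyond 2^15
const MAX_PALETTE_ENTRIES: usize = 256;              // 8-bit indices can't address more
const MAX_INITIAL_BUFFER: usize = 100 * 1024 * 1024; // ~100 MB starting cap

fn check_dimensions(width: i32, height: i32) -> Result<(), String> {
    // BMP height may be negative to mark a top-down image, hence abs().
    if width <= 0 || width > MAX_DIMENSION || height.abs() > MAX_DIMENSION {
        return Err(format!("suspicious dimensions {}x{}", width, height));
    }
    Ok(())
}

fn clamp_palette_len(claimed: usize) -> usize {
    claimed.min(MAX_PALETTE_ENTRIES)
}

fn initial_buffer_len(claimed: usize) -> usize {
    // Start with a bounded buffer; grow towards the claimed size only as
    // actual pixel data arrives.
    claimed.min(MAX_INITIAL_BUFFER)
}

fn main() {
    assert!(check_dimensions(800, -600).is_ok());
    assert!(check_dimensions(40_000, 40_000).is_err());
    assert_eq!(clamp_palette_len(1024), 256);
    assert_eq!(initial_buffer_len(usize::MAX), MAX_INITIAL_BUFFER);
}
```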
Good catch, that's not something I had thought of, and RLE-encoded images present an additional problem, as it's difficult to estimate their size. The BMP header does have a size field, but it's valid for it to be 0, meaning that it's not set, so it's not something that can be relied on. The documentation doesn't seem to specify what should happen if an EOF marker is hit prematurely and doesn't match up with the length/width values; maybe the file should simply be ended at the end of the row containing the EOF marker. I will have a look at how other decoders deal with this and see if I can improve on it a bit in this PR.

Ideally, we would supply the decoder with the actual file size and/or some maximum size like many other decoders do (see the comments in #622) and reject files where the dimensions and sizes don't match up. However, that would require an API change (alternatively, you could use specialization for some types of readers, but that's still unstable and not an ideal solution), which should probably be planned out before actually being implemented, so that is probably better left for a later PR.

A general issue with BMP images is that they are stored upside-down, which makes incrementally increasing the size of the buffer a bit slow and clunky, so we might have to make some trade-offs here depending on the use case. For a web browser it's good to be conservative, as we wouldn't expect websites to have large BMPs anyhow, while for an image editor, loading a large BMP might actually be something that would be expected to work.
Sorry for taking a bit of time to reply. It seems that the behaviour varies a bit. I tested one of the test images that had RLE codes, and modified the size fields in the header to claim very large dimensions (12543x23295).
Not sure what the best approach to this is. I think the RLE skipping features are mostly used for icons with transparent pixels (which are limited to 256x256 pixels), so I'm not sure if loading these types of huge images with EOF RLE codes is very relevant anyhow.
I don't think it's something we need to support loading successfully. But we do need to ensure we don't run out of memory trying to load them. My preference would be for the crate to add an API to specify the maximum image size (and support this for all decoders), but that's a decision for others to make. Then once that is added, the limited starting buffer that this PR implements isn't quite as necessary. Another option would be to implement something like #462.
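A purely hypothetical sketch of what such a maximum-size API could look like; nothing like this existed in the crate at the time of this discussion:

```rust
// Hypothetical limits type a caller could hand to a decoder.
pub struct DecodeLimits {
    pub max_width: u32,
    pub max_height: u32,
    pub max_alloc_bytes: usize,
}

impl DecodeLimits {
    pub fn check(&self, width: u32, height: u32, bytes_per_pixel: u32) -> Result<(), String> {
        if width > self.max_width || height > self.max_height {
            return Err("image dimensions exceed the configured limits".into());
        }
        // Overflow-checked size computation before any allocation happens.
        let needed = (width as usize)
            .checked_mul(height as usize)
            .and_then(|n| n.checked_mul(bytes_per_pixel as usize))
            .ok_or("pixel buffer size overflows usize")?;
        if needed > self.max_alloc_bytes {
            return Err("pixel buffer would exceed the allocation limit".into());
        }
        Ok(())
    }
}
```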
Sorry for taking so long. I removed the commented code and added a very basic check for RLE files with an EoF code, so that a file won't extend the buffer from the initial size if an EoF code is encountered. Not sure if there is an easy way to add anything more without having a continuously expanding buffer, but that's a bit complicated since BMP files are stored upside-down. For now at least, pending any API adjustments, I think it would be advisable for users of the library to check the size of an image using

EDIT: A simple change could be to add a
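A hedged sketch of that EoF check, with illustrative struct and field names:

```rust
// Once an RLE end-of-file code has been seen, never grow the pixel buffer
// past its current (bounded) size; everything after EoF is skipped pixels.
struct PixelBuffer {
    data: Vec<u8>,
    saw_rle_eof: bool,
}

impl PixelBuffer {
    fn grow_to(&mut self, len: usize) {
        if self.saw_rle_eof {
            // No reason to allocate up to the claimed image size.
            return;
        }
        if len > self.data.len() {
            self.data.resize(len, 0);
        }
    }
}

fn main() {
    let mut buf = PixelBuffer { data: vec![0; 16], saw_rle_eof: true };
    buf.grow_to(1 << 30); // ignored: EoF already seen
    assert_eq!(buf.data.len(), 16);
}
```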
Ping @philipc |
I'd like @nwin to review this. I still think that specifying maximum image sizes is a better approach, but that's not a decision for me to make. Otherwise this looks fine to me.
Seems like I produced some conflicts, sorry for that. Can you please resolve them? Looks good otherwise, ready to be merged.
Yup, I'll try to sort out those conflicts.
@nwin I've updated it. Let me know if I need to squash it (if GitHub won't do it automatically).
Merging. |
This PR adds some extra sanity checks to the BMP and ICO decoders, as requested in #622, #623 and #624. E.g. images with x and/or y size values higher than 65535 are rejected, which is what the decoder in Firefox does, and the temporary palette buffer is limited to 256 values (palette indices above 255 can't be referenced by 8-bit values anyhow). Additionally, for the BMP decoder, the initial image buffer is limited to around 100 MB in size and expanded to the full size if needed, so that images with a bogus size won't cause memory usage issues as easily.
The changes do seem to have slightly degraded the performance of BMPs using bitfields (which are primarily 16-bit images), but on the flip side, a small change I made makes decoding standard 24/32-bit BMPs (which I think is the most common type) a bit faster.
This PR also alters the decoding of RLE-encoded BMPs with delta codes, end-of-row codes and end-of-file codes to set the colour value to [0, 0, 0(, 0)] for the "skipped" pixels, as I had to alter some things there anyhow. Before, the pixels ended up being white [FF, FF, FF(, FF)], as that was the default value for the bytes in the image buffer, which didn't seem to be the correct behaviour. Now the behaviour is similar to what the BMP decoder used in Windows seems to do. There doesn't seem to be a clear definition of how to handle the skipped pixels, so I figured that following what Microsoft themselves do would be the best option, and I added a test to check that it works correctly using two example images from the BMP suite (other images from there are already used in the project for other tests, and from what I can gather they are public domain). The behaviour does slightly differ from Firefox and Chrome for these types of files, which seem to make the pixels transparent, while in Image the output buffer won't have an alpha channel except when the BMP is embedded in an ICO.
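As an illustration of the new skipped-pixel behaviour (the helper function is hypothetical, not the PR's code):

```rust
// Regions jumped over by delta, end-of-row and end-of-file codes are zeroed
// instead of being left at the buffer's previous white (0xFF) fill.
fn zero_skipped(pixels: &mut [u8], from: usize, to: usize) {
    for byte in &mut pixels[from..to] {
        *byte = 0; // black, and fully transparent if an alpha channel is present
    }
}

fn main() {
    let mut row = vec![0xFFu8; 12]; // buffer pre-filled with white
    zero_skipped(&mut row, 3, 9);   // e.g. a delta code skipped two RGB pixels
    assert_eq!(&row[3..9], &[0, 0, 0, 0, 0, 0]);
}
```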
Lastly, this PR adds two use statements to fix the compilation of the benchmarks, as I wanted to make sure I hadn't significantly hurt performance.